33 research outputs found
Appearance-Based Gaze Estimation in the Wild
Appearance-based gaze estimation is believed to work well in real-world
settings, but existing datasets have been collected under controlled laboratory
conditions and methods have not been evaluated across multiple datasets. In
this work we study appearance-based gaze estimation in the wild. We present the
MPIIGaze dataset that contains 213,659 images we collected from 15 participants
during natural everyday laptop use over more than three months. Our dataset is
significantly more variable than existing ones with respect to appearance and
illumination. We also present a method for in-the-wild appearance-based gaze
estimation using multimodal convolutional neural networks that significantly
outperforms state-of-the-art methods in the most challenging cross-dataset
evaluation. We present an extensive evaluation of several state-of-the-art
image-based gaze estimation algorithms on three current datasets, including our
own. This evaluation provides clear insights and allows us to identify key
research challenges of gaze estimation in the wild.
Gaze estimation and interaction in real-world environments
Human eye gaze has been widely used in human-computer interaction, as it is a promising modality for natural, fast, pervasive, and non-verbal interaction between humans and computers. As the foundation of gaze-related interactions, gaze estimation has been a hot research topic in recent decades. In this thesis, we focus on developing appearance-based gaze estimation methods and corresponding attentive user interfaces with a single webcam for challenging real-world environments. First, we collect a large-scale gaze estimation dataset, MPIIGaze, the first of its kind collected outside of controlled laboratory conditions. Second, we propose an appearance-based method that, in stark contrast to a long-standing tradition in gaze estimation, only takes the full face image as input. Third, we study data normalisation for the first time in a principled way, and propose a modification that yields significant performance improvements. Fourth, we contribute an unsupervised detector for human-human and human-object eye contact. Finally, we study personal gaze estimation with multiple personal devices, such as mobile phones, tablets, and laptops.
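The data-normalisation step studied in the thesis warps each image to a virtual camera that faces the head frontally. The geometric core of that step, a rotation that points the camera at the face centre and cancels head roll, can be sketched as follows (the full warping pipeline and the proposed modification are not reproduced here; names are ours):

```python
import numpy as np

def normalization_rotation(face_center, head_r_x):
    """Rotation whose rows are the axes of a virtual camera that looks
    straight at the face centre with head roll cancelled.

    face_center: 3D face position in camera coordinates.
    head_r_x:    x-axis of the head coordinate system in camera coordinates.
    """
    z = face_center / np.linalg.norm(face_center)   # forward: toward the face
    x = head_r_x - np.dot(head_r_x, z) * z          # project out the z-component
    x /= np.linalg.norm(x)
    y = np.cross(z, x)                              # completes a right-handed frame
    return np.stack([x, y, z])                      # rows: new camera axes
```

A face straight ahead of an already-level camera should yield the identity rotation, and the result is orthonormal by construction.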
Unsupervised Gaze-aware Contrastive Learning with Subject-specific Condition
Appearance-based gaze estimation has shown great promise in many applications
by using a single general-purpose camera as the input device. However, its
success depends heavily on the availability of large-scale, well-annotated
gaze datasets, which are scarce and expensive to collect. To alleviate this
challenge we propose ConGaze, a contrastive learning-based framework that
leverages unlabeled facial images to learn generic gaze-aware representations
across subjects in an unsupervised way. Specifically, we introduce a
gaze-specific data augmentation that preserves gaze-semantic features and
maintains gaze consistency, both of which prove crucial for effective
contrastive gaze representation learning. Moreover, we devise a novel
subject-conditional projection module that encourages a shared feature extractor
to learn gaze-aware and generic representations. Our experiments on three
public gaze estimation datasets show that ConGaze outperforms existing
unsupervised learning solutions by 6.7% to 22.5%, and achieves 15.1% to 24.6%
improvement over its supervised learning-based counterpart in cross-dataset
evaluations.
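The contrastive objective behind frameworks of this kind is typically the NT-Xent loss: each image and its augmented view form a positive pair, and all other images in the batch serve as negatives. A minimal numpy sketch (ConGaze's actual objective, with its gaze-specific augmentation and subject-conditional projection, is more involved):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of paired embeddings z1, z2 of shape (N, D)."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalise rows
    sim = z @ z.T / temperature                        # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    n = len(z1)
    # the positive for row i is its augmented counterpart at i +/- n
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logprob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -logprob[np.arange(2 * n), pos].mean()
```

Perfectly aligned views should score a lower loss than views paired with the wrong counterpart, which is easy to verify with orthogonal toy embeddings.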
Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
Conventional feature-based and model-based gaze estimation methods have
proven to perform well in settings with controlled illumination and specialized
cameras. In unconstrained real-world settings, however, such methods are
surpassed by recent appearance-based methods due to difficulties in modeling
factors such as illumination changes and other visual artifacts. We present a
novel learning-based method for eye region landmark localization that enables
conventional methods to be competitive with the latest appearance-based methods.
Despite having been trained exclusively on synthetic data, our method exceeds
the state of the art for iris localization and eye shape registration on
real-world imagery. We then use the detected landmarks as input to iterative
model-fitting and lightweight learning-based gaze estimation methods. Our
approach outperforms existing model-fitting and appearance-based methods in the
context of person-independent and personalized gaze estimation.
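A drastically simplified illustration of the idea: once eye-region landmarks are localized, even a crude geometric model — eyeball centre at the midpoint of the eye corners, radius proportional to eye width — yields a gaze estimate. The actual pipeline fits a 3D eye model iteratively; the helper below is a hypothetical toy version:

```python
import numpy as np

def gaze_from_landmarks(iris_center, eye_corner_l, eye_corner_r,
                        eyeball_radius_px=None):
    """Rough (pitch, yaw) gaze from 2D eye-region landmarks in pixels."""
    iris_center = np.asarray(iris_center, float)
    center = (np.asarray(eye_corner_l, float)
              + np.asarray(eye_corner_r, float)) / 2
    if eyeball_radius_px is None:
        # crude assumption: eyeball radius ~ half the eye-corner distance
        eyeball_radius_px = np.linalg.norm(
            np.asarray(eye_corner_r, float)
            - np.asarray(eye_corner_l, float)) / 2
    dx, dy = (iris_center - center) / eyeball_radius_px
    yaw = np.arcsin(np.clip(dx, -1, 1))
    pitch = -np.arcsin(np.clip(dy, -1, 1))   # image y grows downward
    return pitch, yaw
```

An iris sitting exactly at the eye centre gives zero gaze angles; shifting it halfway toward a corner gives a yaw of arcsin(0.5).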
Ground-state Properties and Bogoliubov Modes of a Harmonically Trapped One-Dimensional Quantum Droplet
We study the stationary and excitation properties of a one-dimensional
quantum droplet in the two-component Bose mixture trapped in a harmonic
potential. By constructing the energy functional for the inhomogeneous mixture,
we cast the extended Gross-Pitaevskii equation, applicable to both
symmetric and asymmetric mixtures, into a universal form; the equations in
two different dimensionless schemes stand in a duality relation, i.e. the
single remaining parameters are inverses of each other. The Bogoliubov equations for the
trapped droplet are obtained by linearizing the small density fluctuation
around the ground state and the low-lying excitation modes are calculated
numerically. We find that the confining trap readily destroys the flat-top
structure of large droplets and strongly alters the mean square radius and the
chemical potential. The breathing mode of the confined droplet connects the
self-bound and ideal gas limits, with the excitation in the weakly interacting
Bose condensate for large particle numbers lying in between. We explicitly show
how the continuum spectrum of the excitation is split into discrete modes, and
finally taken over by the harmonic trap. Two critical particle numbers are
identified by the minimum size of the trapped droplet and the maximum breathing
mode energy, both of which are found to decrease exponentially with the
trapping parameter.
Comment: 11 pages, 7 figures
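For reference, the symmetric-mixture form of the one-dimensional extended Gross-Pitaevskii equation with the harmonic confinement added; the beyond-mean-field Petrov-Astrakharchik term is the attractive $-|\psi|\psi$ contribution. This is the standard form of the equation; the paper's dimensionless conventions may differ:

\[
i\hbar\,\partial_t\psi
= \left[-\frac{\hbar^2}{2m}\,\partial_x^2
  + \frac{1}{2}m\omega^2 x^2
  + g|\psi|^2
  - \frac{\sqrt{2m}}{\pi\hbar}\,g^{3/2}|\psi|\right]\psi .
\]

Setting $\omega = 0$ recovers the self-bound droplet, whose flat-top profile the trap is shown to destroy.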
ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation
Gaze estimation is a fundamental task in many applications of computer
vision, human computer interaction and robotics. Many state-of-the-art methods
are trained and tested on custom datasets, making comparison across methods
challenging. Furthermore, existing gaze estimation datasets have limited head
pose and gaze variations, and the evaluations are conducted using different
protocols and metrics. In this paper, we propose a new gaze estimation dataset
called ETH-XGaze, consisting of over one million high-resolution images of
varying gaze under extreme head poses. We collect this dataset from 110
participants with a custom hardware setup including 18 digital SLR cameras and
adjustable illumination conditions, and a calibrated system to record ground
truth gaze targets. We show that our dataset can significantly improve the
robustness of gaze estimation methods across different head poses and gaze
angles. Additionally, we define a standardized experimental protocol and
evaluation metric on ETH-XGaze, to better unify gaze estimation research going
forward. The dataset and benchmark website are available at
https://ait.ethz.ch/projects/2020/ETH-XGaze
Comment: Accepted at ECCV 2020 (Spotlight)
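With a calibrated setup, the ground-truth label for each frame reduces to simple geometry: the unit vector from the 3D eye centre to the recorded gaze target, usually re-expressed as (pitch, yaw). A sketch assuming camera coordinates with y pointing down and z pointing away from the camera (function and variable names are ours):

```python
import numpy as np

def ground_truth_gaze(eye_center, target):
    """Unit gaze direction and (pitch, yaw) from a 3D eye position and a
    calibrated gaze-target position, both in camera coordinates."""
    g = np.asarray(target, float) - np.asarray(eye_center, float)
    g /= np.linalg.norm(g)
    pitch = np.arcsin(-g[1])          # positive pitch looks up
    yaw = np.arctan2(-g[0], -g[2])    # zero yaw looks along -z
    return g, (pitch, yaw)
```

A target straight ahead of the eye along the negative z-axis should map to zero pitch and yaw.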
GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields
We propose GazeNeRF, a 3D-aware method for the task of gaze redirection.
Existing gaze redirection methods operate on 2D images and struggle to generate
3D consistent results. Instead, we build on the intuition that the face region
and eyeballs are separate 3D structures that move in a coordinated yet
independent fashion. Our method leverages recent advancements in conditional
image-based neural radiance fields and proposes a two-stream architecture that
predicts volumetric features for the face and eye regions separately. Rigidly
transforming the eye features via a 3D rotation matrix provides fine-grained
control over the desired gaze angle. The final, redirected image is then
attained via differentiable volume compositing. Our experiments show that this
architecture outperforms naively conditioned NeRF baselines as well as previous
state-of-the-art 2D gaze redirection methods in terms of redirection accuracy
and identity preservation.
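The rigid-transformation step can be made concrete: sample points (or feature coordinates) of the eye region are rotated about the eyeball centre by a matrix built from the target pitch and yaw. A sketch under a common x-then-y rotation convention (the rotation order and sign conventions in GazeNeRF itself may differ):

```python
import numpy as np

def rotation_from_pitch_yaw(pitch, yaw):
    """3D rotation for a gaze redirection by (pitch, yaw): pitch about x,
    then yaw about y."""
    cp, sp, cy, sy = np.cos(pitch), np.sin(pitch), np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    return Ry @ Rx

def redirect_eye_points(points, eyeball_center, pitch, yaw):
    """Rigidly rotate 3D sample points about the eyeball centre — the
    geometric step that gives fine-grained control over the gaze angle."""
    R = rotation_from_pitch_yaw(pitch, yaw)
    p = np.asarray(points, float) - eyeball_center
    return p @ R.T + eyeball_center
```

Rotating a point on the optical axis by 90 degrees of yaw about the origin moves it onto the x-axis, which pins down the convention used here.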
Bioactive Constituents of Verbena officinalis Alleviate Inflammation and Enhance Killing Efficiency of Natural Killer Cells
Natural killer (NK) cells play key roles in eliminating pathogen-infected cells. Verbena
officinalis (V. officinalis) has been used as a medical plant in traditional and modern medicine for
its anti-tumor and anti-inflammatory activities, but its effects on immune responses remain largely
elusive. This study aimed to investigate the potential of V. officinalis extract (VO extract) to regulate
inflammation and NK cell functions. We examined the effects of VO extract on lung injury in a mouse
model of influenza virus infection. We also investigated the impact of five bioactive components of
VO extract on NK killing functions using primary human NK cells. Our results showed that oral
administration of VO extract reduced lung injury, promoted the maturation and activation of NK
cells in the lung, and decreased the levels of inflammatory cytokines (IL-6, TNF-α and IL-1β) in the
serum. Among five bioactive components of VO extract, Verbenalin significantly enhanced NK killing
efficiency in vitro, as determined by real-time killing assays based on plate-reader or high-content
live-cell imaging in 3D using primary human NK cells. Further investigation showed that treatment
of Verbenalin accelerated the killing process by reducing the contact time of NK cells with their
target cells without affecting NK cell proliferation, expression of cytotoxic proteins, or lytic granule
degranulation. Together, our findings suggest that VO extract has a satisfactory anti-inflammatory
effect against viral infection in vivo, and regulates the activation, maturation, and killing functions of
NK cells. Verbenalin from V. officinalis enhances NK killing efficiency, suggesting its potential as a
promising therapeutic to fight viral infection.
Adversarial Attacks on Classifiers for Eye-based User Modelling
An ever-growing body of work has demonstrated the rich information content
available in eye movements for user modelling, e.g. for predicting users'
activities, cognitive processes, or even personality traits. We show that
state-of-the-art classifiers for eye-based user modelling are highly vulnerable
to adversarial examples: small artificial perturbations in gaze input that can
dramatically change a classifier's predictions. We generate these adversarial
examples using the Fast Gradient Sign Method (FGSM), which linearises the
loss around the input to find suitable perturbations. On the sample task of eye-based
document type recognition we study the success of different adversarial attack
scenarios: with and without knowledge about classifier gradients (white-box vs.
black-box) as well as with and without targeting the attack to a specific
class. In addition, we demonstrate the feasibility of defending against
adversarial attacks by adding adversarial examples to a classifier's training
data.
Comment: 9 pages, 7 figures
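For a model whose loss gradient is available in closed form, FGSM is essentially a one-liner: perturb each input dimension by epsilon in the direction of the gradient's sign. A minimal sketch on a logistic classifier over a gaze-feature vector (the paper attacks far richer eye-movement classifiers; all names here are ours):

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """FGSM on a logistic classifier p = sigmoid(w.x + b): shift every
    feature by eps in the direction that increases the cross-entropy loss."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_x = (p - y) * w                 # d(cross-entropy)/dx in closed form
    return x + eps * np.sign(grad_x)
```

For a correctly classified positive example, the perturbation must push the logit down (raising the loss), which makes the attack's effect directly checkable.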